Abstract

In this article we consider a game-theoretic approach to the Risk-Sensitive Benchmarked Asset Management
problem (RSBAM) of Davis and Lleo [5]. In particular, we consider a stochastic differential game
between two players: the investor, who has a power utility, and the market, which tries to minimize the
expected payoff of the investor. The market does this by modulating
a stochastic benchmark that the investor needs to outperform. We obtain an explicit expression for the
optimal pair of strategies for both players.

1 Introduction


In this article we develop a game-theoretic version of a continuous-time optimization model
based on risk-sensitive control, more specifically termed risk-sensitive control portfolio optimization
(RSCPO). The RSCPO balances an investor's interest in maximizing the expected growth rate of wealth
against his aversion to risk due to deviations of the realized rate from the expectation. The subjective notion
of the investor's risk aversion is parameterized by a single variable, say $\theta$. More formally, we write the finite
horizon risk-sensitive optimization criterion as: maximize

$$J_{T,h} := -\frac{1}{\theta} \log E\left[e^{-\theta F(T,h)}\right]$$

where $F(T,h)$ is the time-$T$ value of the reward function corresponding to control $h$. In the optimal investment
problem we take $F(T,h) = \log V(T)$, where $V(t)$ is the time-$t$ value of the portfolio corresponding to the portfolio
asset allocation $h$. An asymptotic expansion around $\theta = 0$ for the above criterion yields

$$J_{T,h} = E[F(T,h)] - \frac{\theta}{2}\,\mathrm{Var}(F(T,h)) + O(\theta^2)$$

From this expression it is clear that the criterion trades off maximizing the portfolio return against
penalizing its riskiness. The optimal expected utility function depends on $\theta$ and is a generalization of
the traditional stochastic control approach to utility optimization, in the sense that the degree of risk
aversion of the investor is now explicitly parameterized through $\theta$ rather than imported into the problem via
an exogenous utility function. Values of $\theta > 0$ correspond to a risk-averse investor, $\theta < 0$ to a risk-seeking
investor and $\theta = 0$ to a risk-neutral investor who maximizes

$$J_{T,h} := E[F(T,h)]$$
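As a quick numerical illustration of the expansion above (a sketch, not part of the original analysis): taking the criterion in the form $-\frac{1}{\theta}\log E[e^{-\theta F}]$ and assuming that $F$ is Gaussian with mean `mu` and variance `var`, the criterion equals `mu - (theta/2) * var` exactly, so the $O(\theta^2)$ remainder vanishes. The function names and sample values below are illustrative assumptions.

```python
import math
import random

def risk_sensitive_closed_form(mu, var, theta):
    """-(1/theta) * log E[exp(-theta*F)] for F ~ N(mu, var), theta != 0."""
    # For Gaussian F, E[exp(-theta*F)] = exp(-theta*mu + 0.5*theta**2*var).
    return mu - 0.5 * theta * var

def risk_sensitive_monte_carlo(mu, var, theta, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the same criterion (seeded for reproducibility)."""
    rng = random.Random(seed)
    sd = math.sqrt(var)
    total = 0.0
    for _ in range(n_samples):
        total += math.exp(-theta * rng.gauss(mu, sd))
    return -math.log(total / n_samples) / theta

exact = risk_sensitive_closed_form(mu=0.05, var=0.04, theta=1.0)
estimate = risk_sensitive_monte_carlo(mu=0.05, var=0.04, theta=1.0)
```

A larger `theta` lowers the criterion for the same mean return, which is the sense in which $\theta$ penalizes variance.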

There has been a substantial amount of research on the infinite-time horizon ergodic problem: $\max J_\infty$, where

$$J_\infty := \liminf_{t \to \infty} \; -\frac{1}{\theta}\, t^{-1} \log E\left[e^{-\theta F(t,h)}\right]$$

Though problems of this type are interesting in their own right, they are not readily applicable to practical asset management because of the non-uniqueness of optimal controls.

In the past decade, applications of risk-sensitive control to asset management have proliferated. Risk-sensitive control was first applied to solve financial problems by Lefebvre and Montulet in a corporate finance context. Fleming [S] was the first to show that some investment optimization models could be reformulated as risk-sensitive control problems. Bielecki and Pliska [2] considered a model with $n$ securities and $m$ economic factors with no transaction costs. They were the first to apply continuous-time risk-sensitive control as a practical tool that could be used to solve "real-world" portfolio selection problems. They considered a long-term asset allocation problem and proposed the logarithm of the investor's wealth as a reward function, so that the investor's objective is to maximize the risk-sensitive (log) return of his/her portfolio. They derived the optimal control and solved the associated Hamilton-Jacobi-Bellman (HJB) PDE under the restrictive assumption that the securities and economic factors have independent noise. In [3], Bielecki and Pliska went on to study the economic properties of the risk-sensitive asset management criterion and then extended the asset management model into an intertemporal CAPM in [2]. Fleming and Sheu [7] analyzed an investment model similar to that of Bielecki and Pliska [2]. In their model, however, the factor process and the security price process were assumed to be correlated. A major contribution was made by Kuroda and Nagai, who introduced an elegant solution method based on a change of measure argument which transforms the risk-sensitive control problem into a linear exponential-of-quadratic regulator problem. They solved the associated HJB PDE over a finite time horizon and then studied the properties of the ergodic HJB PDE related to $J_\infty$.

Recently, Davis and Lleo [5] applied this change of measure technique to solve, for both the finite and the infinite horizon, a risk-sensitive benchmarked investment problem (RSBAM) in which an investor selects an asset allocation to outperform a given financial benchmark. In the Kuroda and Nagai set-up $\theta$ represents the sensitivity of an investor to total risk, whereas in the RSBAM $\theta$ represents the investor's sensitivity to active risk, i.e. the additional risk the investor is willing to take in order to outperform the benchmark. To outperform a stochastic benchmark, an investor will have to modify his or her optimal trading strategy. The question of interest to us is then: "What is the investor's worst-case strategy for an opposing stochastic benchmark?" In particular, one can even take the jaundiced point of view that the benchmark will be set retrospectively to the worst case. For example, if a portfolio fund manager outperforms the set benchmark, the principal may judge this out-performance as either best achieved or poorly achieved with respect to the underlying worst-case scenario. So, in this article we consider a game-theoretic version of the problem within the benchmark framework of Davis and Lleo [5]. In it, we consider a stochastic differential game between two players, namely, the investor (who has a power utility) and a second player, representing the market, who tries to minimize the expected payoff of the investor. We explicitly characterize the optimal allocation of assets and the optimal choice of benchmark index.

In this article, we consider a benchmark process, specified ex ante, that evolves according to a controlled diffusion process. We contrast this approach with that of Heath and Platen. In their methodology, they use the growth optimal portfolio itself as a benchmark, which is closer to the concept of the numeraire portfolio. Although there has been a long history of applying risk-sensitive optimal control to problems in finance, a game-theoretic version of such problems over a finite horizon is missing from the literature. We elaborate on this below.

In the next section we briefly describe the framework of the risk-sensitive zero-sum stochastic differential game corresponding to the desired game (P1) (see (2.8a)). In the third section we reformulate the objective criterion under evaluation as a linear exponential-of-quadratic regulator problem (P2) (see (3.11)). In the fourth section we provide a verification lemma that will help us solve this game problem. In the fifth section we derive the optimal controls and obtain an explicit expression for the associated value of the game. The article concludes with remarks and pointers to future directions of work.

Broadly speaking, our aim is to derive the saddle-point equilibrium pair for the game (P1). To achieve this, we first obtain the saddle-point strategy for the game (P2). We then show that the saddle-point equilibrium for (P2) is also a saddle-point equilibrium for (P1).


2 Risk-sensitive zero sum stochastic differential game 

We consider a market consisting of $m + 1 \ge 2$ securities with $n \ge 1$ factors. We assume that the set of securities includes one bond whose price is governed by the ODE

$$dS^0_t = r_t S^0_t\,dt, \quad S^0_0 = s^0 \qquad (2.1)$$

where $r_t$ is a deterministic function of $t$. The other security prices and factors are assumed to satisfy the following SDEs

$$dS^i_t = S^i_t \Big\{ (a + A X_t)^i\,dt + \sum_{k=1}^{n+m} \sigma^i_k\,dW^k_t \Big\}, \quad S^i_0 = s^i, \quad i = 1, \ldots, m, \qquad (2.2)$$

where the factor process $X_t$ satisfies

$$dX_t = (b + B X_t)\,dt + \Lambda\,dW_t, \quad X_0 = x \in \mathbb{R}^n \qquad (2.3)$$

Here $W_t = (W^k_t)_{k=1,\ldots,n+m}$ is an $(n+m)$-dimensional standard Brownian motion defined on a filtered probability space $(\Omega, \mathcal{F}, P, \mathcal{F}_t)$.

The factor process can represent macro-economic indicators such as GDP, inflation and market index data. The stock price dynamics are modulated by the factor process. Hence one can incorporate the effect of macro-economic indicators into the investment optimization problem by using the stock price process modulated by the factor process $X_t$.
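To make the factor model concrete, the dynamics (2.1)-(2.3) can be simulated with an Euler-Maruyama scheme. The sketch below assumes one risky asset, one factor and a two-dimensional Brownian motion; the constant short rate and all parameter values are illustrative assumptions and do not come from the article.

```python
import math
import random

def simulate_market(T=1.0, steps=250, seed=1,
                    a=0.06, A=0.5, b=0.0, B=-1.0,
                    sigma=(0.2, 0.05), lam=(0.0, 0.1),
                    r=0.02, s0=1.0, x0=0.0):
    """Euler-Maruyama path of (bond, stock, factor) for n = m = 1."""
    rng = random.Random(seed)
    dt = T / steps
    bond, stock, x = 1.0, s0, x0
    for _ in range(steps):
        # Increment of the 2-dimensional Brownian motion W.
        dw = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(2)]
        # Stock: dS = S * ((a + A*X) dt + sigma . dW); the factor tilts the drift.
        stock_next = stock * (1.0 + (a + A * x) * dt
                              + sigma[0] * dw[0] + sigma[1] * dw[1])
        # Factor: dX = (b + B*X) dt + lam . dW (mean reverting when B < 0).
        x_next = x + (b + B * x) * dt + lam[0] * dw[0] + lam[1] * dw[1]
        # Bond: dS0 = r * S0 dt.
        bond *= 1.0 + r * dt
        stock, x = stock_next, x_next
    return bond, stock, x

bond_T, stock_T, factor_T = simulate_market()
```

Because `sigma` and `lam` share the second Brownian component, the stock and the factor are correlated in this sketch.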

The model parameters $A$, $B$, $\Lambda$ are respectively $m \times n$, $n \times n$, $n \times (m+n)$ constant matrices and $a \in \mathbb{R}^m$, $b \in \mathbb{R}^n$. The constant matrix $(\sigma^i_k)_{i=1,\ldots,m;\ k=1,\ldots,n+m}$ will be denoted by $\Sigma$ in what follows.

In Kuroda and Nagai it is assumed that the factor process and the stock price process do not have independent noise, i.e. $\Sigma\Lambda' \neq 0$. This assumption is in sharp contrast to Bielecki and Pliska [2], who assume that $\Sigma\Lambda' = 0$. We will assume that $\Sigma\Lambda' \neq 0$.

Let $\mathcal{G}_t = \sigma(S_u, X_u, L^\gamma_u;\ u \le t)$ be the sigma-field generated by the underlying stock price process, the factor process and the benchmark process $L^\gamma$ (to be defined later) up to time $t$. The investment strategy, which represents the proportional allocation of total wealth in the $i$-th security $S^i_t$, is denoted by $h^i_t$ for $i = 1, \ldots, m$. The strategy $(h^0_t, h_t)_{0 \le t \le T}$ is said to be an investment strategy up to time $T$. We set $S_t := (S^1_t, S^2_t, \ldots, S^m_t)'$ and $h_t := (h^1_t, \ldots, h^m_t)'$. The space of controls $\mathcal{H}(T)$ consists of $\mathbb{R}^m$-valued controls for the investor as follows: $\mathcal{H}(T)$ is the set of $(\mathcal{B}[0,T] \otimes \mathcal{G}_t)_{t \ge 0}$-progressively measurable stochastic processes such that $\sum_{i=1}^{m} h^i_t + h^0_t = 1$,

1 

where P(Jq \h s \ 2 ds < oo) = 1 V T < oo and E[e^ h s dsj 2 < 

For given $h \in \mathcal{H}(T)$, the process $V_t = V^h_t$ represents the investor's wealth at time $t$ under the control $h$, and satisfies the SDE

$$\frac{dV^h_t}{V^h_t} = \big(r_t + h_t'((a + A X_t) - r_t \mathbf{1})\big)\,dt + h_t' \Sigma\,dW_t, \quad V_0 = v$$

which can be rewritten as

$$\frac{dV^h_t}{V^h_t} = (r_t + h_t' d_t)\,dt + h_t' \Sigma\,dW_t, \quad V_0 = v \qquad (2.4)$$

where $d_t = a + A X_t - r_t \mathbf{1}$. From equation (2.4) it can be seen that if $a + A X_t = r_t \mathbf{1}$, i.e. $d_t = 0$, then the portfolio wealth process evolves with drift equal to the riskless interest rate $r_t$. We assume that the securities price volatility matrix $\Sigma$ has full rank. If it does not, then $h'\Sigma = 0$ for some $h \neq 0$. The market then contains redundant asset(s), and the portfolio value process $V^h_t$ grows at a rate different from the riskless interest rate $r_t$ when $h'd \neq 0$, resulting in an arbitrage. This is the case if the portfolio contains two or more redundant assets, for example a stock and an option on the same stock. Hence we remove redundancy until the resulting matrix $\Sigma$ is of full rank, thereby ensuring that there is no further possibility of arbitrage by trading in the resulting portfolio. In our benchmark model we express the objective through a new optimization criterion corresponding to a reward function $F$, which represents the log excess return of the asset portfolio over its benchmark and is given as

$$F(t; h, \gamma) = \log \frac{V^h_t}{L^\gamma_t}, \qquad F(0; h, \gamma) = \log f$$

We now formally state the risk-sensitive benchmarked asset management problem (RSBAM) that we solve.

Problem: Risk-sensitive Benchmarked Asset Management (RSBAM)

We first define the objective criterion $J$ as

$$J(f, x, h, \gamma, T; \theta) := -\frac{2}{\theta} \log E\left[ U\left( \frac{V^h_T}{L^\gamma_T} \right) \right] \qquad (2.5)$$

where the utility function $U(\cdot)$ is $U : x \mapsto x^{-\theta/2}$. The benchmark process $L^\gamma$ is a diffusion process modulated by a (Markovian) control $\gamma$, with dynamics given by

$$\frac{dL^\gamma_t}{L^\gamma_t} = (\alpha_t + \beta_t X_t)\,dt + \gamma_t'\,dW_t \qquad (2.6)$$

where $\alpha_t \in \mathbb{R}$ and $\beta_t \in \mathbb{R}^{1 \times n}$. The space of controls $\Gamma(T)$ consists of the market control represented by $\gamma$, which is $\mathbb{R}^{n+m}$-valued. $\Gamma(T)$ consists of controls that are progressively measurable with respect to $(\mathcal{B}[0,T] \otimes \mathcal{G}_t)_{t \ge 0}$

t 1 

and where P{Jq | 7 s| 2 ds < 00 ) = 1 V T < 00 and E[e e2 fo T»T '» ds ] 2 < 00 . 

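By a simple application of Itô's formula we have:

$$dF(t; h, \gamma) = d\log\left(\frac{V^h_t}{L^\gamma_t}\right) = \Big[ r_t + h_t'(a + A X_t - r_t \mathbf{1}) - (\alpha_t + \beta_t X_t) - \tfrac{1}{2} h_t'\Sigma\Sigma' h_t + \tfrac{1}{2} \gamma_t'\gamma_t \Big]\,dt + (h_t'\Sigma - \gamma_t')\,dW_t \qquad (2.7)$$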
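For completeness, the intermediate step behind (2.7) can be written out as follows. This is only Itô's formula for the logarithm applied separately to the wealth dynamics (2.4) and the benchmark dynamics (2.6), with $d_t = a + A X_t - r_t \mathbf{1}$ as above.

```latex
\begin{aligned}
d\log V_t^h &= \Big(r_t + h_t' d_t - \tfrac12\, h_t'\Sigma\Sigma' h_t\Big)\,dt + h_t'\Sigma\,dW_t,\\
d\log L_t^\gamma &= \Big(\alpha_t + \beta_t X_t - \tfrac12\, \gamma_t'\gamma_t\Big)\,dt + \gamma_t'\,dW_t,\\
dF(t;h,\gamma) &= d\log V_t^h - d\log L_t^\gamma .
\end{aligned}
```

Subtracting the second line from the first gives the drift and the diffusion coefficient $h_t'\Sigma - \gamma_t'$ of $F$.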


We are now in a position to formally state the game-theoretic version of the problem. For a given $\theta > 0$, we consider a stochastic differential game between two players. The investor, who has the power utility $U$, modulates the payoff for a given $\gamma \in \Gamma(T)$ via the control $h \in \mathcal{H}(T)$. The second player, say the market, behaves antagonistically to the investor by setting a benchmark for the investor to outperform, modulating the control $\gamma$ for a given control $h$. This can be conceptualized as a risk-sensitive zero-sum stochastic differential game between the investor on one side and the market on the other, and is formalized as follows.

Problem (P1) Obtain $\hat{h} \in \mathcal{H}(T)$ and $\hat{\gamma} \in \Gamma(T)$ such that

$$J(f, x, \hat{h}, \hat{\gamma}, T) = \sup_{h \in \mathcal{H}(T)} \inf_{\gamma \in \Gamma(T)} -\frac{2}{\theta} \log E\left[ \left( \frac{V^h_T}{L^\gamma_T} \right)^{-\theta/2} \right] = \inf_{\gamma \in \Gamma(T)} \sup_{h \in \mathcal{H}(T)} -\frac{2}{\theta} \log E\left[ \left( \frac{V^h_T}{L^\gamma_T} \right)^{-\theta/2} \right] \qquad (2.8a)$$

This can be construed as a game-theoretic version of the RSBAM problem.

Remark 2.1:

The problem set-up (P1) is an extension of Kuroda and Nagai and of Davis and Lleo [5]. However, the former do not consider the benchmarked version, i.e. the benchmark index is identically one in their model, while Davis and Lleo [6], though they have a benchmarked portfolio criterion, solve the one-player optimization problem and not the two-player saddle-point problem.

In light of the mathematical preliminaries just discussed, we formally lay out the plan to solve the zero-sum stochastic differential game (P1).

Step 1 We reformulate the original objective criterion, a power utility function, as an exponential-of-integral criterion.

Step 2 Define a new path functional $I(f, x, h, \gamma, t, T)$ (see equation (3.9)) related to the exponential-of-integral criterion. Define $\bar{u}(t,x)$ to be the upper-value function and $\underline{u}(t,x)$ the lower-value function for the game associated with $I$. Denote the game related to this objective functional as (P2).

Step 3 Deduce the HJBI PDE corresponding to game (P2) (see (3.11)).

Step 4 Formulate the conditions that a candidate value function should satisfy for the game with objective function $I$ to have a value. This constitutes the verification lemma.

Step 5 Solve the HJBI PDE derived in Step 3 while obtaining the expression for the optimal controls. This optimal control pair will constitute a saddle-point equilibrium for (P2). The candidate value function satisfying all the conditions of the verification lemma is our desired value function for (P2).

Step 6 Returning to the original problem (P1), show using facts derived in Step 4 that the game with objective criterion $J$ has a value as well, and that this value is in fact $u(0,x)$.

In the next section we reformulate the objective criterion and formalize our game problem.


3 Problem Reformulation 


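Step 1

We first transform the utility optimization problem (2.5) into the optimization of an exponential-of-integral performance criterion.

Criterion under the expectation

Our first aim is to write the objective criterion $J$ only in terms of the factor process. To that end we define the function $g(x, h, \gamma, r; \theta)$ as follows:

$$g(x, h, \gamma, r; \theta) = \frac{1}{2}\Big(\frac{\theta}{2} + 1\Big) h'\Sigma\Sigma' h - r - h'(a + Ax - r\mathbf{1}) + (\alpha + \beta x) - \frac{\theta}{4}\big(h'\Sigma\gamma + \gamma'\Sigma' h\big) + \frac{1}{2}\Big(\frac{\theta}{2} - 1\Big)\gamma'\gamma \qquad (3.1)$$

From (2.7) and (3.1) we therefore have

$$d\Big(-\frac{\theta}{2} F(t; h, \gamma)\Big) = \frac{\theta}{2}\Big( g(X_t, h_t, \gamma_t, r_t; \theta)\,dt - (h_t'\Sigma - \gamma_t')\,dW_t \Big) - \frac{\theta^2}{8}(h_t'\Sigma - \gamma_t')(\Sigma' h_t - \gamma_t)\,dt \qquad (3.2)$$

Thus we have

$$\exp\Big(-\frac{\theta}{2} F(t; h, \gamma)\Big) = f^{-\theta/2} \exp\Big\{ \frac{\theta}{2}\int_0^t g(X_s, h_s, \gamma_s, r_s; \theta)\,ds - \frac{\theta}{2}\int_0^t (h_s'\Sigma - \gamma_s')\,dW_s - \frac{\theta^2}{8}\int_0^t (h_s'\Sigma - \gamma_s')(h_s'\Sigma - \gamma_s')'\,ds \Big\} \qquad (3.3)$$

where $V_0 = v$, $L_0 = l$ and $f = V_0 / L_0 = v / l$.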
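The algebra linking (2.7), (3.1) and (3.3) can be checked numerically: the combination $\frac{\theta}{2} g - \frac{\theta^2}{8}|\Sigma' h - \gamma|^2$ should equal $-\frac{\theta}{2}$ times the drift of $F$. The sketch below does this for one asset and one factor with randomly drawn inputs. The scalar set-up and all function names are illustrative assumptions, and the formula for `g` follows our reading of (3.1).

```python
import random

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def g(x, h, gamma, r, theta, a, A, alpha, beta, sigma):
    """g as read from (3.1) for one asset and one factor; sigma, gamma are 2-vectors."""
    hs = [h * s for s in sigma]                      # Sigma' h
    return (0.5 * (theta / 2 + 1) * dot(hs, hs)
            - r - h * (a + A * x - r)
            + (alpha + beta * x)
            - (theta / 2) * dot(hs, gamma)           # -(theta/4)(h'Sigma gamma + gamma'Sigma'h)
            + 0.5 * (theta / 2 - 1) * dot(gamma, gamma))

def drift_of_F(x, h, gamma, r, a, A, alpha, beta, sigma):
    """Drift of F = log(V/L) as in (2.7)."""
    hs = [h * s for s in sigma]
    return (r + h * (a + A * x - r) - (alpha + beta * x)
            - 0.5 * dot(hs, hs) + 0.5 * dot(gamma, gamma))

def identity_gap(seed=0):
    """(theta/2) g - (theta^2/8)|Sigma'h - gamma|^2 + (theta/2) drift(F); should be ~0."""
    rng = random.Random(seed)
    x, h, r, theta, a, A, alpha, beta = (rng.uniform(-1, 1) for _ in range(8))
    sigma = [rng.uniform(-1, 1) for _ in range(2)]
    gamma = [rng.uniform(-1, 1) for _ in range(2)]
    diff = [h * s - c for s, c in zip(sigma, gamma)]
    lhs = ((theta / 2) * g(x, h, gamma, r, theta, a, A, alpha, beta, sigma)
           - (theta ** 2 / 8) * dot(diff, diff))
    return lhs + (theta / 2) * drift_of_F(x, h, gamma, r, a, A, alpha, beta, sigma)

gaps = [identity_gap(seed) for seed in range(20)]
```

All entries of `gaps` vanish up to floating-point rounding, which is what makes the exponent in (3.3) a rewriting of $-\frac{\theta}{2}F$.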

Change of measure

Let $P^{h,\gamma}$ be the measure on $(\Omega, \mathcal{F})$ defined by

$$\frac{dP^{h,\gamma}}{dP}\Big|_{\mathcal{F}_t} = \chi_t \qquad (3.4)$$

where $\chi_t$ is given by

$$\chi_t = \mathcal{E}\Big( -\frac{\theta}{2} \int_0^{\cdot} (h_s'\Sigma - \gamma_s')\,dW_s \Big)_t \qquad (3.5)$$

and where $\mathcal{E}(\cdot)$ denotes the Doléans-Dade or martingale exponential. From the assumptions made on the spaces of admissible controls $\mathcal{H}(T)$ and $\Gamma(T)$, the Kazamaki condition $E\big[e^{-\frac{\theta}{4}\int_0^t (h_s'\Sigma - \gamma_s')\,dW_s}\big] < \infty$ for all $t \in [0,T]$ is satisfied, so that $P^{h,\gamma}$ is a probability measure, i.e.

$$E\Big[ \mathcal{E}\Big( -\frac{\theta}{2} \int_0^{\cdot} (h_s'\Sigma - \gamma_s')\,dW_s \Big)_T \Big] = 1. \qquad (3.6)$$




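We note that

$$W^{h,\gamma}_t = W_t + \frac{\theta}{2} \int_0^t (\Sigma' h_s - \gamma_s)\,ds, \qquad (3.7)$$

by Girsanov's formula, is a standard Brownian motion under $P^{h,\gamma}$, and the factor process $X_t$ satisfies

$$dX_t = \Big(b + B X_t - \frac{\theta}{2}\Lambda(\Sigma' h_t - \gamma_t)\Big)\,dt + \Lambda\,dW^{h,\gamma}_t \qquad (3.8)$$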
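The normalization $E[\chi_T] = 1$ behind the change of measure can be illustrated by Monte Carlo in the simplest case where $h_s'\Sigma - \gamma_s'$ is a constant vector `c`, so that $\chi_T = \exp(-\frac{\theta}{2} c \cdot W_T - \frac{\theta^2}{8}|c|^2 T)$. The constant integrand and the sample size are illustrative assumptions; the article's controls are general adapted processes.

```python
import math
import random

def mean_density(theta=1.0, c=(0.6, -0.8), T=1.0, n_samples=50_000, seed=7):
    """Sample mean of chi_T = exp(-(theta/2) c.W_T - (theta^2/8)|c|^2 T)."""
    rng = random.Random(seed)
    norm2 = sum(ci * ci for ci in c)
    total = 0.0
    for _ in range(n_samples):
        w = [rng.gauss(0.0, math.sqrt(T)) for _ in c]     # W_T ~ N(0, T I)
        stoch_int = sum(ci * wi for ci, wi in zip(c, w))  # integral of c' dW over [0, T]
        total += math.exp(-0.5 * theta * stoch_int - theta ** 2 / 8.0 * norm2 * T)
    return total / n_samples

chi_mean = mean_density()
```

The sample mean stays close to one, consistent with $\chi$ being a true martingale in this constant-coefficient case.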


Step 2

The HJB equation

Taking expectations with respect to the physical measure $P$ on both sides of equation (3.3), taking logarithms, multiplying by $-\frac{2}{\theta}$ and applying the change of measure argument of (3.4)-(3.5), one arrives at the new path functional $I$ defined as

$$I(f, x, h, \gamma, t, T) = \log f - \frac{2}{\theta} \log E^{h,\gamma}\Big[ \exp\Big\{ \frac{\theta}{2} \int_0^{T-t} g(X_s, h_s, \gamma_s, r_{s+t}; \theta)\,ds \Big\} \Big] \qquad (3.9)$$

The upper-value function $\bar{u}$ and lower-value function $\underline{u}$ for the game corresponding to the new path functional $I$ are then given by

$$\bar{u}(t,x) = \sup_{h \in \mathcal{H}(T)} \inf_{\gamma \in \Gamma(T)} I(f, x, h, \gamma, t, T) \qquad (3.10a)$$

$$\underline{u}(t,x) = \inf_{\gamma \in \Gamma(T)} \sup_{h \in \mathcal{H}(T)} I(f, x, h, \gamma, t, T) \qquad (3.10b)$$

$$\bar{u}(t,x) = \underline{u}(t,x) = u(t,x) \qquad (3.10c)$$

If a pair of controls satisfies (3.10c), then the game corresponding to the new path functional $I$ has the value $u$ and the pair of controls constitutes saddle-point strategies for the game with respect to $I$. Let the exponentially transformed functional $\tilde{I}$ be defined as $\tilde{I} = \exp(-\frac{\theta}{2} I)$ and let $\tilde{u}(t,x) := \exp(-\frac{\theta}{2} u(t,x))$. We now consider the problem of determining the saddle-point equilibrium for the game corresponding to the path functional $\tilde{I}$. We call this problem (P2); it is formally stated as follows:

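Problem (P2) Obtain $\hat{h} \in \mathcal{H}(T)$ and $\hat{\gamma} \in \Gamma(T)$ such that

$$\tilde{u}(t,x) = \inf_{h \in \mathcal{H}(T)} \sup_{\gamma \in \Gamma(T)} \tilde{I}(f, x, h, \gamma, t, T) = \sup_{\gamma \in \Gamma(T)} \inf_{h \in \mathcal{H}(T)} \tilde{I}(f, x, h, \gamma, t, T) = E^{\hat{h},\hat{\gamma}}\Big[ \exp\Big\{ \frac{\theta}{2} \int_0^{T-t} g(X_s, \hat{h}_s, \hat{\gamma}_s, r_{s+t}; \theta)\,ds \Big\} f^{-\theta/2} \Big] \qquad (3.11)$$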


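We now provide a verification lemma for this game. Let us first define the process $Y^{h,\gamma}(t)$ by

$$dY^{h,\gamma}(t) = \begin{pmatrix} dt \\ dX_t \end{pmatrix} = \begin{pmatrix} dt \\ \big(b + B X_t - \frac{\theta}{2}\Lambda(\Sigma' h_t - \gamma_t)\big)\,dt + \Lambda\,dW^{h,\gamma}_t \end{pmatrix}$$

Let $y = (t,x)$. The control processes $h(t) = h(t,\omega)$ and $\gamma(t) = \gamma(t,\omega)$ for $\omega \in \Omega$ can be assumed to be Markovian. Let $\mathcal{O} = (0,T) \times \mathbb{R}^n$. Then the process $Y^{h,\gamma}(t)$ is a Markov process whose generator $\tilde{A}^{h,\gamma}$ acting on a function $\tilde{u}(t,x) \in C_0^{2}([0,T] \times \mathbb{R}^n)$ is given by

$$\tilde{A}^{h,\gamma}\tilde{u}(t,x) = \frac{\partial \tilde{u}}{\partial t} + \Big(b + Bx - \frac{\theta}{2}\Lambda(\Sigma' h - \gamma)\Big)' D\tilde{u}(t,x) + \frac{1}{2}\mathrm{tr}\big(\Lambda\Lambda' D^2\tilde{u}(t,x)\big) \qquad (3.12)$$

in which $D\tilde{u}(t,x) = \big(\frac{\partial \tilde{u}}{\partial x_1}, \ldots, \frac{\partial \tilde{u}}{\partial x_n}\big)'$ and $D^2\tilde{u}(t,x)$ is the matrix defined by $D^2\tilde{u}(t,x) = \big[\frac{\partial^2 \tilde{u}}{\partial x_i \partial x_j}\big]$, $i,j = 1, 2, \ldots, n$.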

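Step 3

By an application of the Feynman-Kac formula, it can be deduced from (3.11) that the HJBI PDE for $\tilde{u}(t,x)$ is given by

$$\Big(\tilde{A}^{h,\gamma} + \frac{\theta}{2} g(x, h, \gamma, r; \theta)\Big)\tilde{u}(t,x) = 0 \qquad (3.13)$$

Reversing the exponential transformation and dividing by $-(\theta/2)\tilde{u}(t,x)$, we can deduce from (3.13) that the HJBI PDE for $u(t,x)$ is given, for $h \in \mathbb{R}^m$ and $\gamma \in \mathbb{R}^{m+n}$, by

$$A^{h,\gamma} u(t,x) = 0 \qquad (3.14)$$

where the operator $A^{h,\gamma}$ is given by

$$A^{h,\gamma} u(t,x) = \frac{\partial u}{\partial t} + \Big(b + Bx - \frac{\theta}{2}\Lambda(\Sigma' h - \gamma)\Big)' Du(t,x) + \frac{1}{2}\mathrm{tr}\big(\Lambda\Lambda' D^2 u(t,x)\big) - \frac{\theta}{4}(Du(t,x))'\Lambda\Lambda' Du(t,x) - g(x, h, \gamma, r; \theta) \qquad (3.15)$$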
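The step from (3.13) to (3.14)-(3.15) rests on the chain rule for $\tilde{u} = \exp(-\frac{\theta}{2}u)$. Written out as a worked step (our own expansion of the argument):

```latex
\begin{aligned}
\frac{\partial \tilde u}{\partial t} &= -\frac{\theta}{2}\,\tilde u\,\frac{\partial u}{\partial t}, \qquad
D\tilde u = -\frac{\theta}{2}\,\tilde u\,Du, \qquad
D^2\tilde u = -\frac{\theta}{2}\,\tilde u\,D^2u + \frac{\theta^2}{4}\,\tilde u\,Du\,(Du)',\\
\frac12\operatorname{tr}\big(\Lambda\Lambda' D^2\tilde u\big)
 &= -\frac{\theta}{2}\,\tilde u\Big[\frac12\operatorname{tr}\big(\Lambda\Lambda' D^2u\big)
 - \frac{\theta}{4}\,(Du)'\Lambda\Lambda' Du\Big].
\end{aligned}
```

Substituting these expressions into (3.13) and dividing by $-\frac{\theta}{2}\tilde u$ produces the quadratic gradient term $-\frac{\theta}{4}(Du)'\Lambda\Lambda' Du$ and turns $+\frac{\theta}{2} g\,\tilde u$ into $-g$.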

In the next section we provide a verification lemma for the game based on the criterion function I. 


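4 Verification lemma for the game (P2)


Step 4

We now provide a verification lemma related to the game (P2).

Proposition 4.1. Suppose $w \in C^{1,2}(\mathcal{O}) \cap C(\bar{\mathcal{O}})$ (the space of functions that are twice differentiable on $\mathcal{O}$ with respect to $x$, once continuously differentiable on $\mathcal{O}$ with respect to $t$, and continuous on $\bar{\mathcal{O}}$). Suppose there exists a (Markov) control $(\hat{h}(y), \hat{\gamma}(y))$ such that

1. $\big(\tilde{A}^{h,\hat{\gamma}(y)} + \frac{\theta}{2} g(x, h, \hat{\gamma}(y), r; \theta)\big)[w(y)] \ge 0$ for all $h \in \mathbb{R}^m$;

2. $\big(\tilde{A}^{\hat{h}(y),\gamma} + \frac{\theta}{2} g(x, \hat{h}(y), \gamma, r; \theta)\big)[w(y)] \le 0$ for all $\gamma \in \mathbb{R}^{m+n}$;

3. $\big(\tilde{A}^{\hat{h}(y),\hat{\gamma}(y)} + \frac{\theta}{2} g(x, \hat{h}(y), \hat{\gamma}(y), r; \theta)\big)[w(y)] = 0$ for all $y \in \mathcal{O}$;

4. $w(T, X_T) = f^{-\theta/2}$.

Define

$$Z(s) = Z_s(h, \gamma) = \frac{\theta}{2} \int_0^s g(X_\tau, h_\tau, \gamma_\tau, r_{t+\tau}; \theta)\,d\tau \qquad (4.1)$$

5. $E^{h,\gamma}\big[\int_0^{T-t} Dw(t+s, X_s)'\Lambda e^{Z_s}\,dW^{h,\gamma}_s\big] = 0$ for all $h \in \mathbb{R}^m$ and all $\gamma \in \mathbb{R}^{m+n}$.

Now define, for each $y \in \mathcal{O}$, $h \in \mathcal{H}(T)$ and $\gamma \in \Gamma(T)$,

$$\tilde{I}(f, x, h, \gamma, t, T) = \exp\Big(-\frac{\theta}{2} I(f, x, h, \gamma, t, T)\Big) = E^{h,\gamma}\Big[ \exp\Big\{ \frac{\theta}{2} \int_0^{T-t} g(X_s, h_s, \gamma_s, r_{s+t}; \theta)\,ds \Big\} f^{-\theta/2} \Big]$$

Then $(\hat{h}(y), \hat{\gamma}(y))$ is an optimal (Markov) control, i.e.

$$\begin{aligned}
w(0,x) = \tilde{u}(0,x) = \tilde{I}(f, x, \hat{h}, \hat{\gamma}, 0, T)
&= \inf_{h \in \mathcal{H}(T)} \Big\{ \sup_{\gamma \in \Gamma(T)} \big[\tilde{I}(f, x, h, \gamma, 0, T)\big] \Big\} \\
&= \sup_{\gamma \in \Gamma(T)} \Big\{ \inf_{h \in \mathcal{H}(T)} \big[\tilde{I}(f, x, h, \gamma, 0, T)\big] \Big\} \\
&= \sup_{\gamma \in \Gamma(T)} \tilde{I}(f, x, \hat{h}, \gamma, 0, T) \\
&= \inf_{h \in \mathcal{H}(T)} \tilde{I}(f, x, h, \hat{\gamma}, 0, T) = \tilde{I}(f, x, \hat{h}, \hat{\gamma}, 0, T)
\end{aligned}$$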


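Proof Apply Itô's formula to $w(t+s, X_s)e^{Z_s}$ to obtain

$$d\big(w(t+s, X_s)e^{Z_s}\big) = e^{Z_s}\Big(\tilde{A}^{h,\gamma} + \frac{\theta}{2} g(X_s, h_s, \gamma_s, r_{s+t}; \theta)\Big)[w(t+s, X_s)]\,ds + e^{Z_s}\big(Dw(t+s, X_s)\big)'\Lambda\,dW^{h,\gamma}_s$$

so that

$$w(T, X_{T-t})e^{Z_{T-t}} = w(t,x) + \int_0^{T-t} \Big(\Big(\tilde{A}^{h,\gamma} + \frac{\theta}{2} g(X_s, h_s, \gamma_s, r_{s+t}; \theta)\Big) w(t+s, X_s)\Big) e^{Z_s}\,ds + \int_0^{T-t} \big(Dw(t+s, X_s)\big)'\Lambda e^{Z_s}\,dW^{h,\gamma}_s \qquad (4.2)$$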


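From condition (4) of the Proposition, we have $w(T, X_T) = f^{-\theta/2}$. Taking expectations with respect to $P^{h,\hat{\gamma}}$, setting $t = 0$ and using conditions (1) and (5) of the Proposition, we get

$$E^{h,\hat{\gamma}}\big[w(T, X_T)e^{Z_T}\big] \ge w(0,x)$$

Since this inequality is true for all $h \in \mathcal{H}(T)$, we have

$$\inf_{h \in \mathcal{H}(T)} E^{h,\hat{\gamma}}\big[f^{-\theta/2} e^{Z_T}\big] \ge w(0,x)$$

Hence we have

$$\sup_{\gamma \in \Gamma(T)} \inf_{h \in \mathcal{H}(T)} E^{h,\gamma}\big[f^{-\theta/2} e^{Z_T}\big] \ge \inf_{h \in \mathcal{H}(T)} E^{h,\hat{\gamma}}\big[f^{-\theta/2} e^{Z_T}\big] \ge w(0,x) \qquad (4.3)$$

Similarly, setting $t = 0$ and using condition (2) of the Proposition, we get the following bound:

$$E^{\hat{h},\gamma}\big[w(T, X_T)e^{Z_T}\big] \le w(0,x)$$

Since this inequality is true for all $\gamma \in \Gamma(T)$, we have

$$\sup_{\gamma \in \Gamma(T)} E^{\hat{h},\gamma}\big[f^{-\theta/2} e^{Z_T}\big] \le w(0,x)$$

Hence we have

$$\inf_{h \in \mathcal{H}(T)} \sup_{\gamma \in \Gamma(T)} E^{h,\gamma}\big[f^{-\theta/2} e^{Z_T}\big] \le \sup_{\gamma \in \Gamma(T)} E^{\hat{h},\gamma}\big[f^{-\theta/2} e^{Z_T}\big] \le w(0,x) \qquad (4.4)$$

Also, setting $t = 0$ and using condition (3) of the Proposition and the definition of $\tilde{u}$ in (3.11), we get

$$E^{\hat{h},\hat{\gamma}}\big[w(T, X_T)e^{Z_T}\big] = w(0,x) = E^{\hat{h},\hat{\gamma}}\Big[\exp\Big\{\frac{\theta}{2}\int_0^{T} g(X_s, \hat{h}_s, \hat{\gamma}_s, r_s; \theta)\,ds\Big\} f^{-\theta/2}\Big] \qquad (4.5)$$

It is automatically true that

$$\sup_{\gamma \in \Gamma(T)} \inf_{h \in \mathcal{H}(T)} E^{h,\gamma}\big[f^{-\theta/2} e^{Z_T}\big] \le \inf_{h \in \mathcal{H}(T)} \sup_{\gamma \in \Gamma(T)} E^{h,\gamma}\big[f^{-\theta/2} e^{Z_T}\big]. \qquad (4.6)$$

Conversely, from (4.3), (4.4) and (4.5) we have

$$\inf_{h \in \mathcal{H}(T)} \sup_{\gamma \in \Gamma(T)} E^{h,\gamma}\big[f^{-\theta/2} e^{Z_T}\big] \le w(0,x) \le \sup_{\gamma \in \Gamma(T)} \inf_{h \in \mathcal{H}(T)} E^{h,\gamma}\big[f^{-\theta/2} e^{Z_T}\big] \qquad (4.7)$$

Hence from (4.6) and (4.7) we have

$$\sup_{\gamma \in \Gamma(T)} \inf_{h \in \mathcal{H}(T)} E^{h,\gamma}\big[f^{-\theta/2} e^{Z_T}\big] = \inf_{h \in \mathcal{H}(T)} \sup_{\gamma \in \Gamma(T)} E^{h,\gamma}\big[f^{-\theta/2} e^{Z_T}\big] = w(0,x) = E^{\hat{h},\hat{\gamma}}\big[f^{-\theta/2} e^{Z_T}\big] \qquad (4.8)$$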
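The structure of this argument has a finite analogue that may help intuition: for any payoff matrix the max-min value never exceeds the min-max value, and the two coincide exactly when a saddle point exists. The sketch below is a discrete illustration only, with hypothetical payoff matrices; here the row player maximizes and the column player minimizes.

```python
def lower_value(payoff):
    """Max over rows of the min over columns (row player maximizes)."""
    return max(min(row) for row in payoff)

def upper_value(payoff):
    """Min over columns of the max over rows (column player minimizes)."""
    return min(max(col) for col in zip(*payoff))

def saddle_points(payoff):
    """Entries that are the minimum of their row and the maximum of their column."""
    cols = list(zip(*payoff))
    return [(i, j)
            for i, row in enumerate(payoff)
            for j, v in enumerate(row)
            if v == min(row) and v == max(cols[j])]

with_saddle = [[3, 1, 4],
               [5, 2, 6],
               [0, -1, 7]]
without_saddle = [[1, -1],
                  [-1, 1]]
```

For `with_saddle` both values equal the saddle entry, mirroring (4.8); for `without_saddle` the inequality in (4.6) is strict and no pure saddle point exists.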


Corollary 4.2 Admissible (optimal) strategies for the exponentially transformed problem given by (3.11) are also admissible (optimal) for the problem (3.10c). Formally,

$$\begin{aligned}
u(0,x) &= \sup_{h \in \mathcal{H}(T)} \Big\{ \inf_{\gamma \in \Gamma(T)} \big[I(f, x, h, \gamma, 0, T)\big] \Big\} \\
&= \inf_{\gamma \in \Gamma(T)} \Big\{ \sup_{h \in \mathcal{H}(T)} \big[I(f, x, h, \gamma, 0, T)\big] \Big\} \\
&= \inf_{\gamma \in \Gamma(T)} I(f, x, \hat{h}, \gamma, 0, T) \\
&= \sup_{h \in \mathcal{H}(T)} I(f, x, h, \hat{\gamma}, 0, T) = I(f, x, \hat{h}, \hat{\gamma}, 0, T)
\end{aligned}$$

Proof The value functions $u$ and $\tilde{u}$ are related through the strictly monotone continuous transformation $\tilde{u}(t,x) = \exp(-\frac{\theta}{2} u(t,x))$. Thus admissible (optimal) strategies for the exponentially transformed problem are also admissible (optimal) for the problem (3.10c). ■


5 Solving the risk-sensitive zero sum stochastic differential game


Step 5

We seek the value function $u$ for the game defined in (3.12). We guess a solution, assuming that it belongs to the class $C^{1,2}((0,T) \times \mathbb{R}^n)$, and show that the guess satisfies all the conditions of our verification lemma given by Proposition 4.1. Conditions (1)-(4) of the verification lemma can be written in compact form as

$$\sup_{h \in \mathcal{H}(T)} \inf_{\gamma \in \Gamma(T)} A^{h,\gamma} u(t,x) = 0; \qquad u(T,x) = \log f \qquad (5.1)$$

Motivated by the results in Kuroda and Nagai, we look for a $u$ given by $u(t,x) = \frac{1}{2} x' Q_t x + q_t' x + k_t$, where $Q$ is an $n \times n$ symmetric matrix, $q \in \mathbb{R}^n$ and $k$ is a scalar. Substituting this form in (3.15) we get

$$\begin{aligned}
A^{h,\gamma} u(t,x) ={}& \frac{1}{2} x' \frac{dQ_t}{dt} x + \frac{dq_t'}{dt} x + \frac{dk_t}{dt} + \Big(b + Bx - \frac{\theta}{2}\Lambda(\Sigma' h_t - \gamma_t)\Big)'(Q_t x + q_t) \\
&+ \frac{1}{2}\mathrm{tr}(\Lambda\Lambda' Q_t) - \frac{\theta}{4}(Q_t x + q_t)'\Lambda\Lambda'(Q_t x + q_t) \\
&- \frac{1}{2}\Big(\frac{\theta}{2} + 1\Big) h_t'\Sigma\Sigma' h_t + r_t - (\alpha_t + \beta x) + h_t'(a + Ax - r_t \mathbf{1}) + \frac{\theta}{4}\big(h_t'\Sigma\gamma_t + \gamma_t'\Sigma' h_t\big) - \frac{1}{2}\Big(\frac{\theta}{2} - 1\Big)\gamma_t'\gamma_t
\end{aligned} \qquad (5.2)$$


Remark 5.1 Since the game considered is for the risk-averse investor, $\theta > 0$. Moreover, based on the expression for $\hat{\gamma}$ in (5.5), $\theta \neq 2$. This leaves two possibilities: $\theta \in (0,2)$ or $\theta \in (2,\infty)$. For the optimal strategies $(\hat{h}, \hat{\gamma})$ to be a saddle-point equilibrium for the game, we require the quadratic term in $h$ to be negative definite and the quadratic term in $\gamma$ to be positive definite. For $\theta > 0$ the quadratic term in $h$ is indeed negative definite, while for $\theta < 2$ the quadratic term in $\gamma$ is positive definite. Hence in our case the valid range of $\theta$ is between 0 and 2, which excludes the other possibilities for the range of $\theta$.

We now solve the first-order condition for $\gamma$ to minimize $A^{h,\gamma} u(t,x)$ over all $\gamma \in \mathbb{R}^{n+m}$:


(2 - 0 ) 7 t - 0(£ fit— 7 )Du(t, x) = 0 


(5.3) 


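The first-order condition for $h$ that maximizes $A^{h,\hat{\gamma}} u(t,x)$ over all $h \in \mathbb{R}^m$, in terms of $u(t,x)$, is

$$\hat{h}_t = \frac{2}{\theta + 2}(\Sigma\Sigma')^{-1}\Big[d_t + \frac{\theta}{2}\Sigma\gamma_t - \frac{\theta}{2}\Sigma\Lambda' Du(t,x)\Big] \qquad (5.4)$$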
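The first-order condition for $h$ can be sanity-checked numerically in the scalar case. The sketch below collects the $h$-dependent terms of $A^{h,\gamma}u$ for one asset and one factor, evaluates the candidate maximizer, and compares it with perturbed values. The function names and all parameter values are illustrative assumptions, and the formula follows our reading of (5.4).

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def h_part(h, theta, d, sigma, lam, gamma, du):
    """Terms of A^{h,gamma} u that depend on h (one asset, one factor)."""
    return (-(theta / 2) * h * dot(sigma, lam) * du          # from the drift of X times Du
            - 0.5 * (theta / 2 + 1) * h * h * dot(sigma, sigma)
            + h * d                                          # excess return h' d
            + (theta / 2) * h * dot(sigma, gamma))           # cross term with the benchmark

def h_star(theta, d, sigma, lam, gamma, du):
    """Candidate maximizer from the first-order condition, scalar h."""
    bracket = d + (theta / 2) * dot(sigma, gamma) - (theta / 2) * dot(sigma, lam) * du
    return 2.0 / (theta + 2.0) * bracket / dot(sigma, sigma)

params = dict(theta=1.0, d=0.04, sigma=(0.2, 0.05), lam=(0.0, 0.1),
              gamma=(0.1, -0.05), du=0.3)
best = h_star(**params)
best_value = h_part(best, **params)
```

Since the coefficient of $h^2$ is $-\frac{1}{2}(\frac{\theta}{2}+1)\Sigma\Sigma' < 0$ for $\theta > 0$, the stationary point is a global maximum in $h$, in line with Remark 5.1.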

Substituting $\hat{h}$ obtained in (5.4) back into (5.3) we get

7 1 = k ~ A Du(t,x)] (5.5) 

The optimal control $\hat{h}_t$ is a global maximum while $\hat{\gamma}_t$ is a global minimum for $t \in [0,T]$. We substitute $\hat{h}$ from (5.4) and $\hat{\gamma}$ from (5.5) in (5.1) to obtain

$$A^{\hat{h},\hat{\gamma}} u(t,x) = 0; \qquad u(T,x) = \log f \qquad (5.6)$$

We then group all the resulting quadratic terms in $x$, linear terms in $x$ and constants together to conclude that the choice $u(t,x) = \frac{1}{2} x' Q_t x + q_t' x + k_t$ is indeed a solution of the HJBI PDE (5.1), provided that $Q$, $q$ and $k$ satisfy the following system of differential equations:

• a matrix Riccati equation related to the coefficient of the quadratic term and used to determine the symmetric non-negative matrix $Q_t$, given as


^ = Q t K 0 Qt + K[Q t + Q t Ki + 2 2 6 /(EE" 1 ) ^4 = 0 0 < t < T, 

at (2 — 0 2 ) 

Qt = 0 (5.7) 


where 

K 0 = 


2 ( 2 - 8 ) 


K i=B- 


AA 


28 


2 8 2 


(2-9 2 ) 


( 2 — 6 >)( 2 — 6 > 2 ) 
t A'(SE') _1 EA' 


t AE'(EE') *ea' 


• a linear ordinary differential equation satisfied by the $n$-element column vector $q(t)$, given as


dg t 

dt 


+ (K 1 + Q t K 0 )q t + Q t b + (a — r(f)l) (EE ) [——-—^HA Q{t ) + 

- A 


qr = 0 


(5.8) 


• a linear ordinary differential equation satisfied by the scalar $k_t$, given as

^ + ^r(AA'Qt) + r t -a t - (2 J g 2 ^ 2 (« - r(t)l)'(EE') EA'g(t) 

+ ^-^(a-r^l) -1 ^') (a ~ r(f)l) + _ ^2)2 g'ft) A E'(EE') EA'g(f) 

“ 4(I^) , ' ( ‘ )AA '' ,( ‘ ) 

= log/ (5.9) 

Condition 4 of Proposition 4.1, expressed in terms of $u$, imposes the terminal condition in (5.9).

If $K_0$ is positive definite, then a unique solution $Q_t$ to the Riccati equation (5.7) exists for all $t \le T$. This property of positive definiteness follows from the interpretation of the solution $Q_t$ as the covariance matrix of observations from a Kalman filter used to estimate the state of a dynamical system (see Theorem 4.4.1 in Davis [5] for details). The uniqueness of $Q_t$ follows from the standard existence-uniqueness theorem for first-order differential equations (see Proposition 4.4.2 in Davis [5]).
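In practice a Riccati equation with a terminal condition is integrated backward in time. The sketch below does this for a generic scalar equation $\frac{dQ}{dt} = k_0 Q^2 - 2 k_1 Q - c$ with $Q(T) = 0$, using a classical fourth-order Runge-Kutta step. The scalar form, its sign convention and the coefficients `k0`, `k1`, `c` are hypothetical stand-ins chosen for illustration; they are not the coefficients of (5.7).

```python
import math

def solve_riccati_backward(k0, k1, c, T=1.0, steps=1000):
    """Integrate dQ/dt = k0*Q**2 - 2*k1*Q - c backward from Q(T) = 0; returns Q(0)."""
    def rhs(q):
        return k0 * q * q - 2.0 * k1 * q - c
    dt = -T / steps          # negative step: march from t = T down to t = 0
    q = 0.0                  # terminal condition Q(T) = 0
    for _ in range(steps):
        s1 = rhs(q)
        s2 = rhs(q + 0.5 * dt * s1)
        s3 = rhs(q + 0.5 * dt * s2)
        s4 = rhs(q + dt * s3)
        q += dt * (s1 + 2.0 * s2 + 2.0 * s3 + s4) / 6.0
    return q

# Linear special case (k0 = 0) has the closed form (c / (2*k1)) * (exp(2*k1*T) - 1).
q0_linear = solve_riccati_backward(k0=0.0, k1=-0.5, c=0.3)
q0_quadratic = solve_riccati_backward(k0=1.0, k1=-0.5, c=0.3)
```

In the quadratic case the backward solution increases monotonically from zero toward the positive root of $k_0 Q^2 - 2 k_1 Q - c = 0$ and stays bounded on $[0,T]$, which is the qualitative behaviour the existence result above guarantees.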

It remains to be seen whether $\tilde{u} = \exp(-\frac{\theta}{2} u)$, for this choice of $u$, satisfies condition (5) of Proposition 4.1.

Proposition 5.2 $E^{h,\gamma}\big[\int_0^{T-t} e^{Z_s}\big(D\tilde{u}(t+s, X_s)\big)'\Lambda\,dW^{h,\gamma}_s\big] = 0$.

Proof From the definition of $\tilde{u}$ in (3.11), for any optimal control belonging to $\Gamma(T)$, the strategy $h = 0$ is sub-optimal and hence provides an upper bound on $\tilde{u}$. Further, for the zero-benchmark case, namely $\gamma = 0$, we obtain an upper bound on $\tilde{u}$:

$$\tilde{u}(t,x) = E^{\hat{h},\hat{\gamma}}\Big[\exp\Big\{\frac{\theta}{2}\int_0^{T-t} g(X_s, \hat{h}_s, \hat{\gamma}_s, r_{s+t}; \theta)\,ds\Big\} f^{-\theta/2}\Big] \le E^{0,\hat{\gamma}}\Big[\exp\Big\{\frac{\theta}{2}\int_0^{T-t} g(X_s, 0, \hat{\gamma}_s, r_{s+t}; \theta)\,ds\Big\} f^{-\theta/2}\Big]$$

$$\tilde{u}(t,x) \le E^{0,0}\Big[\exp\Big\{\frac{\theta}{2}\int_0^{T-t} g(X_s, 0, 0, r_{s+t}; \theta)\,ds\Big\} f^{-\theta/2}\Big] = \exp\Big(-\frac{\theta}{2}\int_0^{T-t} r_{s+t}\,ds\Big) f^{-\theta/2}$$

Now $Q$ and $q$ are solutions to the system of ODEs, and hence are integrals of bounded functions. Hence $Q$ and $q$ are continuous functions of time $t \in [0,T]$ and therefore bounded on $[0,T]$. The matrix $\Lambda$ is a known constant. From the standard existence-uniqueness result for stochastic differential equations (see Oksendal [13]) we have $X \in L^2(\Omega, \mathcal{F}, P^{h,\gamma})$. Hence from the upper bound on $\tilde{u}$, Remark 3.1 and the fact that $Du(t, X_t) = Q_t X_t + q_t$ is in $L^2(\Omega, \mathcal{F}, P^{h,\gamma})$, we have that $E^{h,\gamma}\big(\langle D\tilde{u}'\Lambda e^{Z}, D\tilde{u}'\Lambda e^{Z}\rangle_t\big) < \infty$ for all $t \in [0,T]$. Hence we have $E^{h,\gamma}\big[\int_0^{T-t} D\tilde{u}(t+s, X_s)'\Lambda e^{Z_s}\,dW^{h,\gamma}_s\big] = 0$. ■

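It is clear that our guess $\tilde{u} = \exp(-\frac{\theta}{2} u)$ satisfies conditions (1)-(5) of Proposition 4.1. Hence our choice of $u$ is indeed the value of the game (P2), and the controls $\hat{h}$, $\hat{\gamma}$ form the saddle-point equilibrium of this game.

Lemma 5.2 For the choice of the spaces of controls $\mathcal{H}(T)$ and $\Gamma(T)$, we have

$$E\Big[\mathcal{E}\Big(-\frac{\theta}{2}\int_0^{\cdot} \big((Q_t X_t + q_t)'\Lambda + (h_t'\Sigma - \gamma_t')\big)\,dW_t\Big)_T\Big] = 1 \qquad (5.10)$$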

Proof: From the Kazamaki condition, refer (Oksendal 03]), (5.10) holds if 

J?[exp(J Q t 6( 7 a ) ^pt/^)] <00 \f t £ [0,T]. Hence by application of Cauchy-Schwartz inequal¬ 

ity we have, 

+ + ~ "Is) )dIF s )] < (E[efo e ( Q ‘ Xs+q ^ Adw ‘]) 1 / 2 (E[ef ° e( ' h '‘‘ j: -' y '°' >dWs ]) 1 / 2 



However for E[e^o 9{QsX a +q e )hdw e ^ < 00 to hold , it is enough to show that the Novikov condition given by 

E[efo 8 (.QsX a +q„)AA (QsXs+gsjdsj qq re f er (Oksendal 14] ). Since X is Gaussian process and Qt and 

q t are deterministic, ( QtX t + q t )A is Gaussian and hence by completion of squares argument detailed in 

Theorem 5.3 below we have E[efo s 2 (Q s x a +q s )AA (Q a x a +q„)ds^ < ^ anc [ h ence E[efo 6 (Q*x.+q a )\dw.^ < 

$\infty$ for all $t \in [0,T]$ is validated. $\big(E\big[e^{\int_0^t \theta (h_s'\Sigma - \gamma_s')\,dW_s}\big]\big)^{1/2} < \infty$ is validated by a similar application of the Cauchy-Schwarz inequality followed by the assumption made earlier in the definition of the spaces of controls $\mathcal{H}(T)$ and $\Gamma(T)$. Thus the Kazamaki condition holds and the conclusion follows.

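Theorem 5.3 If there exists a solution $Q$ to (5.7), then the strategies $(\hat{h}, \hat{\gamma})$ defined by

$$\hat{h}_t = \frac{2}{\theta + 2}(\Sigma\Sigma')^{-1}\Big[d_t + \frac{\theta}{2}\Sigma\hat{\gamma}_t - \frac{\theta}{2}\Sigma\Lambda'(Q_t X_t + q_t)\Big] \qquad (5.11)$$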


7 1 


e 

2-9 


[Xht 


A (QtX t + q t )\ 


(5.12) 


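where $q$ is a solution of (5.8), are admissible, i.e. $\hat{h} \in \mathcal{H}(T)$ and $\hat{\gamma} \in \Gamma(T)$, and are optimal for the finite horizon game problem (P1), namely,

$$\begin{aligned}
u(0,x) &= \sup_{h \in \mathcal{H}(T)} \inf_{\gamma \in \Gamma(T)} J(f, x, h, \gamma, T; \theta) \\
&= \inf_{\gamma \in \Gamma(T)} \sup_{h \in \mathcal{H}(T)} J(f, x, h, \gamma, T; \theta) \\
&= \inf_{\gamma \in \Gamma(T)} J(f, x, \hat{h}, \gamma, T; \theta) \\
&= \sup_{h \in \mathcal{H}(T)} J(f, x, h, \hat{\gamma}, T; \theta) \\
&= J(f, x, \hat{h}, \hat{\gamma}, T; \theta) \\
&= \frac{1}{2} x' Q_0 x + q_0' x + k_0
\end{aligned}$$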

Proof The controls $(\hat{h}, \hat{\gamma})$ derived in Section 5 form the saddle-point equilibrium for the game (P2). We aim to show that these controls are in fact admissible and optimal for the problem (P1) as well.

Proof of admissibility From the expressions for $\hat{h}$ and $\hat{\gamma}$ in (5.11) and (5.12) respectively, we note that $-\frac{\theta}{2}\big((Q_t X_t + q_t)'\Lambda + (\hat{h}_t'\Sigma - \hat{\gamma}_t')\big)$ can be written linearly in $X_t$ as $X_t' v^1_t + v^2_t$, where the coefficients $v^1_t$ and $v^2_t$ are given by


v\ = ~Q'm+ tJlA'(EE')- 1 SA' + tJlg'^AS'^sV^a-rl) 

(2 2 -lfi 2-9*) Q ' {t) - (SS ')' _1 EA ' g( f ) - j^e)Q'mhq(t). 
v* = ~q (t)A + 9 ^ d _ ^ (a - rl)'(EE')~ 1 SA'g(t) 

~ ( 2 % -(^«'(oaa' 9( o 


Since $X$ satisfies the SDE $dX_t = (b + B X_t)\,dt + \Lambda\,dW_t$, we have $E|X_t| \le E|X(0)| + |b|T + |B|\int_0^t E|X_s|\,ds$. By Gronwall's inequality, therefore, $E|X_t| \le (E|X(0)| + |b|T)\exp(|B|t)$ and $\mathrm{Cov}(X_t) = \Lambda\Lambda' t$. Let $\phi_t = v^1_t X_t + v^2_t$. We now explicitly calculate $E[e^{\delta|\phi_t|^2}]$ for some $\delta > 0$, since Remark 2 in Lemma 2 of Section 12 (Gihman and Skorokhod [9]) would then imply that Novikov's condition holds. Let $R_t = e^{-Bt} X_t + e^{bt}$. Hence $dR_t = e^{-Bt}\Lambda\,dW_t$. Therefore $R_t$ is a Gaussian process and hence $\phi_t$ is a Gaussian process with drift. Also $\mu_t = E[|\phi_t|] \le \sup_{0 \le t \le T}|v^1_t|\,(E|X_0| + |b|T)\exp(|B|t) + \sup_{0 \le t \le T}|v^2_t|$ and $\Sigma_t = \mathrm{Cov}(\phi_t) \le v^{1\prime}_t \Lambda\Lambda' v^1_t\, t$. Thus the mean $\mu_t$ and covariance $\Sigma_t$ are bounded above for $t \le T$. We use the following completion of squares argument: $\frac{1}{2} z'Az + b'z + c = \frac{1}{2}(z + A^{-1}b)'A(z + A^{-1}b) + c - \frac{1}{2} b'A^{-1}b$.

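$$
\begin{aligned}
E\big[e^{\delta|\phi_t|^2}\big]
&= \int_{\mathbb{R}^n} \frac{1}{(2\pi)^{n/2}|\Sigma_t\Sigma_t'|^{1/2}}\, e^{\delta|\phi|^2 - \frac12(\phi-\mu_t)'(\Sigma_t\Sigma_t')^{-1}(\phi-\mu_t)}\, d\phi_1\, d\phi_2 \ldots d\phi_n \\
&= \int_{\mathbb{R}^n} \frac{1}{(2\pi)^{n/2}|\Sigma_t\Sigma_t'|^{1/2}}\, e^{-\frac12\phi'\left(-2\delta I + (\Sigma_t\Sigma_t')^{-1}\right)\phi + \phi'(\Sigma_t\Sigma_t')^{-1}\mu_t - \frac12\mu_t'(\Sigma_t\Sigma_t')^{-1}\mu_t}\, d\phi_1 \ldots d\phi_n \\
&= |\Sigma_t\Sigma_t'|^{-1/2}\, \big|\left(-2\delta I + (\Sigma_t\Sigma_t')^{-1}\right)^{-1}\big|^{1/2} \\
&\qquad \times \exp\Big\{-\tfrac12\mu_t'(\Sigma_t\Sigma_t')^{-1}\mu_t + \tfrac12\mu_t'(\Sigma_t\Sigma_t')^{-1}\left(-2\delta I + (\Sigma_t\Sigma_t')^{-1}\right)^{-1}(\Sigma_t\Sigma_t')^{-1}\mu_t\Big\}
\end{aligned}
$$

The matrix $\Sigma_t\Sigma_t'$ is symmetric positive definite with lowest eigenvalue $\lambda_{\min}$, say. It is then easy to show that, for $\delta$ sufficiently small relative to $\lambda_{\min}$, the matrix $\left(-2\delta I + (\Sigma_t\Sigma_t')^{-1}\right)^{-1}$ is positive definite. Together with the fact derived above that $\mu_t$ and $\Sigma_t$ are bounded for $t \le T$, this shows that there exists a constant $C$ such that $E[e^{\delta|\phi_t|^2}] \le C$. Hence the optimal controls $\hat h$, $\hat\gamma$ belong to their respective admissible classes, viz. $\mathcal{H}(T)$ and $\Gamma(T)$. ■

Proof of optimality Define,

$$Z_s(h,\gamma) = \frac{\theta}{2}\int_0^s g(X_\tau, h_\tau, \gamma_\tau, r_{t+\tau}; \theta)\,d\tau - \frac{\theta}{2}\int_0^s (h_\tau'\Sigma - \gamma_\tau')\,dW_\tau - \frac{\theta^2}{8}\int_0^s (h_\tau'\Sigma - \gamma_\tau')(h_\tau'\Sigma - \gamma_\tau')'\,d\tau \tag{5.13}$$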


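Also define $\chi(t,x) = -\frac{\theta}{2}\big(u(t,x) - \log f\big)$ and $Lu(t,x) = \frac12\mathrm{tr}\big(\Lambda\Lambda' D^2u(t,x)\big) + (b + Bx)'Du(t,x)$. Hence, we have

$$d\chi(t+s, X_s) = -\frac{\theta}{2}\Big(\frac{\partial u}{\partial t} + Lu\Big)(t+s, X_s)\,ds - \frac{\theta}{2}Du(t+s, X_s)'\Lambda\,dW_s$$

Hence,

$$\frac{d\exp\{\chi(t+s, X_s)\}}{\exp\{\chi(t+s, X_s)\}} = -\frac{\theta}{2}\Big(\frac{\partial u}{\partial t} + Lu\Big)(t+s, X_s)\,ds - \frac{\theta}{2}Du(t+s, X_s)'\Lambda\,dW_s + \frac{\theta^2}{8}Du'\Lambda\Lambda'Du(t+s, X_s)\,ds$$

and so,

$$
\begin{aligned}
\frac{d\big[\exp\{\chi(t+s, X_s)\}\exp\{Z(s)\}\big]}{\exp\{\chi(t+s, X_s)\}\exp\{Z(s)\}}
&= -\frac{\theta}{2}\Big(\frac{\partial u}{\partial t} + Lu\Big)(t+s, X_s)\,ds - \frac{\theta}{2}Du(t+s, X_s)'\Lambda\,dW_s \\
&\quad + \frac{\theta^2}{8}Du'\Lambda\Lambda'Du(t+s, X_s)\,ds + \frac{\theta}{2}g(X_s, h_s, \gamma_s, r_{s+t}; \theta)\,ds \\
&\quad - \frac{\theta}{2}\big(h'(s)\Sigma - \gamma'(s)\big)\,dW_s + \frac{\theta^2}{4}\big(h'(s)\Sigma - \gamma'(s)\big)\Lambda'Du(t+s, X_s)\,ds
\end{aligned}
$$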


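Hence from (3.15), we have,

$$
\begin{aligned}
\exp\{\chi(T, X(T-t)) + Z(T-t)\} = \exp(\chi(t,x))\exp\Big\{ &\int_0^{T-t} -\frac{\theta}{2}\big(A^{h,\gamma}u(t+s, X_s)\big)\,ds \\
&- \frac{\theta}{2}\int_0^{T-t}\big[Du(t+s, X_s)'\Lambda + (h_s'\Sigma - \gamma_s')\big]\,dW_s \\
&- \frac{\theta^2}{8}\int_0^{T-t}\big[Du(t+s, X_s)'\Lambda + (h_s'\Sigma - \gamma_s')\big]\big[Du(t+s, X_s)'\Lambda + (h_s'\Sigma - \gamma_s')\big]'\,ds\Big\}
\end{aligned}
\tag{5.14}
$$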


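We have shown that $u$ satisfies conditions (1)-(5) of Proposition 4.1. Hence from condition (4) of Proposition 4.1, we have $\chi(T,x) = 0$. Now setting $t = 0$ and taking condition (1) of Proposition 4.1 into account for $\gamma = \hat\gamma$ and for any $h \in \mathcal{H}(T)$, we see from (5.14) that

$$
\begin{aligned}
f^{-\theta/2}\exp\{Z_T(h,\hat\gamma)\} \ge e^{-\frac{\theta}{2}u(0,x)}\exp\Big\{ &-\frac{\theta}{2}\int_0^T\big[Du(s, X_s)'\Lambda + (h_s'\Sigma - \hat\gamma_s')\big]\,dW_s \\
&- \frac{\theta^2}{8}\int_0^T\big[Du(s, X_s)'\Lambda + (h_s'\Sigma - \hat\gamma_s')\big]\big[Du(s, X_s)'\Lambda + (h_s'\Sigma - \hat\gamma_s')\big]'\,ds\Big\}
\end{aligned}
$$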


Now taking expectations with respect to the physical probability measure $P$ on both sides of the above inequality and using Lemma 5.2, we obtain

$$J(f,x,h,\hat\gamma,T) \le u(0,x)$$

This inequality is true for all $h \in \mathcal{H}(T)$, so we have

$$\sup_{h\in\mathcal{H}(T)} J(f,x,h,\hat\gamma,T) \le u(0,x)$$



Hence we have,

$$\inf_{\gamma\in\Gamma(T)}\,\sup_{h\in\mathcal{H}(T)} J(f,x,h,\gamma,T) \le \sup_{h\in\mathcal{H}(T)} J(f,x,h,\hat\gamma,T) \le u(0,x) \tag{5.15}$$


Likewise, setting $t = 0$ and taking condition (2) of Proposition 4.1 into account for $h = \hat h$ and for any $\gamma \in \Gamma(T)$, we see that

$$J(f,x,\hat h,\gamma,T) \ge u(0,x)$$

This inequality is true for all $\gamma \in \Gamma(T)$, so:

$$\inf_{\gamma\in\Gamma(T)} J(f,x,\hat h,\gamma,T) \ge u(0,x)$$

Hence we have,

$$\sup_{h\in\mathcal{H}(T)}\,\inf_{\gamma\in\Gamma(T)} J(f,x,h,\gamma,T) \ge \inf_{\gamma\in\Gamma(T)} J(f,x,\hat h,\gamma,T) \ge u(0,x) \tag{5.16}$$


Hence from (5.15) and (5.16) we have,

$$\sup_{h\in\mathcal{H}(T)}\,\inf_{\gamma\in\Gamma(T)} J(f,x,h,\gamma,T) \ge u(0,x) \ge \inf_{\gamma\in\Gamma(T)}\,\sup_{h\in\mathcal{H}(T)} J(f,x,h,\gamma,T) \tag{5.17}$$


Moreover, setting $t = 0$ and taking condition (3) of Proposition 4.1 into account for $h = \hat h$, $\gamma = \hat\gamma$ (such that $\hat h \in \mathcal{H}(T)$ and $\hat\gamma \in \Gamma(T)$), we see that

$$J(f,x,\hat h,\hat\gamma,T) = u(0,x) \tag{5.18}$$


It is always true that

$$\sup_{h\in\mathcal{H}(T)}\Big(\inf_{\gamma\in\Gamma(T)} J(f,x,h,\gamma,T)\Big) \le \inf_{\gamma\in\Gamma(T)}\Big(\sup_{h\in\mathcal{H}(T)} J(f,x,h,\gamma,T)\Big) \tag{5.19}$$

Hence, combining (5.17) and (5.19), we deduce the final conclusion that the game (P1) has a value, and that this value is $u(0,x)$. ■
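The role of inequality (5.19) and of the saddle point in this argument can be illustrated on a finite zero-sum game. The sketch below is only a finite-dimensional analogue and not the continuous-time game (P1): the payoff matrices are hypothetical, rows stand for the maximising player and columns for the minimising player, and the function names are illustrative.

```python
def maximin(payoff):
    """sup over rows (maximiser) of inf over columns (minimiser)."""
    return max(min(row) for row in payoff)


def minimax(payoff):
    """inf over columns (minimiser) of sup over rows (maximiser)."""
    n_cols = len(payoff[0])
    return min(max(row[j] for row in payoff) for j in range(n_cols))


def saddle_points(payoff):
    """Pairs (i, j) whose entry is the minimum of row i and the maximum of column j."""
    points = []
    for i, row in enumerate(payoff):
        for j, value in enumerate(row):
            column = [r[j] for r in payoff]
            if value == min(row) and value == max(column):
                points.append((i, j))
    return points


if __name__ == "__main__":
    # Hypothetical payoff matrix with a saddle point: both orders of
    # optimisation agree, so the game has a value.
    with_saddle = [[3, 5], [1, 4]]
    # Matching-pennies style matrix without a saddle point in pure
    # strategies: the inequality sup inf <= inf sup is strict.
    without_saddle = [[1, -1], [-1, 1]]
    for game in (with_saddle, without_saddle):
        print(maximin(game), minimax(game), saddle_points(game))
```

In the first matrix the two quantities coincide at the saddle entry, which mirrors how (5.17) and (5.19) together force equality at $(\hat h, \hat\gamma)$; in the second only the weak inequality of (5.19) holds.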
6 Conclusion


In this article we formulate a two-player zero-sum stochastic differential game in the context of the risk-sensitive benchmarked asset management problem. We obtain an explicit expression for the optimal strategies of both players. Future work could be directed towards a game theoretic benchmark problem with an infinite horizon risk-sensitive criterion.